Dataset statistics
| Number of variables | 39 |
|---|---|
| Number of observations | 347469 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 103.4 MiB |
| Average record size in memory | 312.0 B |
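As a quick consistency check, the average record size should equal the total memory footprint divided by the number of observations. The following is a minimal sketch that uses only the figures reported in the table above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table above.
total_size_mib = 103.4
n_observations = 347469
n_variables = 39

# 1 MiB = 1024 * 1024 bytes.
total_size_bytes = total_size_mib * 1024 * 1024
avg_record_bytes = total_size_bytes / n_observations

# Total number of cells in the dataset (rows x columns).
n_cells = n_observations * n_variables

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {n_cells}")
```

Because the reported total is itself rounded to one decimal place, the result only agrees with the table to about that precision.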
Variable types
| BOOL | 22 |
|---|---|
| NUM | 9 |
| CAT | 8 |
Warnings
| Warning | Type |
|---|---|
| building_id has unique values | Unique |
| geo_level_1_id has 5358 (1.5%) zeros | Zeros |
| age has 34725 (10.0%) zeros | Zeros |
| count_families has 27937 (8.0%) zeros | Zeros |
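The zero percentages in these warnings follow directly from the zero counts and the number of observations. A small sketch, assuming the counts quoted above (the helper name `zero_share` is hypothetical):

```python
n_observations = 347469

# Zero counts copied from the warnings above.
zero_counts = {
    "geo_level_1_id": 5358,
    "age": 34725,
    "count_families": 27937,
}


def zero_share(count, total):
    """Return the percentage of zero-valued entries, rounded to one decimal."""
    return round(100.0 * count / total, 1)


shares = {name: zero_share(count, n_observations)
          for name, count in zero_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}% zeros")
```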
Reproduction
| Analysis started | 2020-09-29 02:10:49.731631 |
|---|---|
| Analysis finished | 2020-09-29 02:12:06.114848 |
| Duration | 1 minute and 16.38 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
building_id
Real number (ℝ≥0)
| Distinct | 347469 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 525913.5838 |
|---|---|
| Minimum | 4 |
| Maximum | 1052934 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 52200.8 |
| Q1 | 261999 |
| median | 526071 |
| Q3 | 789588 |
| 95-th percentile | 1000694 |
| Maximum | 1052934 |
| Range | 1052930 |
| Interquartile range (IQR) | 527589 |
Descriptive statistics
| Standard deviation | 304354.4791 |
|---|---|
| Coefficient of variation (CV) | 0.5787157595 |
| Kurtosis | -1.201737909 |
| Mean | 525913.5838 |
| Median Absolute Deviation (MAD) | 263777 |
| Skewness | 0.001061379559 |
| Sum | 1.827386671e+11 |
| Variance | 9.263164894e+10 |
| Monotonicity | Not monotonic |
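Several entries in the statistics tables directly above are derived from the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are simple differences. The sketch below re-derives them from the reported figures; it is an illustrative check, not the code the profiling tool runs.

```python
import math

# Figures copied from the tables directly above.
n = 347469
mean = 525913.5838
std = 304354.4791
minimum, maximum = 4, 1052934
q1, q3 = 261999, 789588

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all values

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.9e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"Sum      = {total:.9e}")
```

The same relationships hold for every numeric variable in the report, so the check can be repeated by swapping in another variable's figures.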
| Value | Count | Frequency (%) | |
| 1052670 | 1 | < 0.1% | |
| 784426 | 1 | < 0.1% | |
| 682008 | 1 | < 0.1% | |
| 161818 | 1 | < 0.1% | |
| 684059 | 1 | < 0.1% | |
| 153630 | 1 | < 0.1% | |
| 151583 | 1 | < 0.1% | |
| 415271 | 1 | < 0.1% | |
| 768034 | 1 | < 0.1% | |
| 259653 | 1 | < 0.1% | |
| Other values (347459) | 347459 | > 99.9% |
| Value | Count | Frequency (%) | |
| 4 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 12 | 1 | < 0.1% | |
| 13 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1052934 | 1 | < 0.1% | |
| 1052931 | 1 | < 0.1% | |
| 1052929 | 1 | < 0.1% | |
| 1052926 | 1 | < 0.1% | |
| 1052923 | 1 | < 0.1% |
geo_level_1_id
Real number (ℝ≥0)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.89731458 |
|---|---|
| Minimum | 0 |
| Maximum | 30 |
| Zeros | 5358 |
| Zeros (%) | 1.5% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 7 |
| median | 12 |
| Q3 | 21 |
| 95-th percentile | 27 |
| Maximum | 30 |
| Range | 30 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 8.032596704 |
|---|---|
| Coefficient of variation (CV) | 0.5779963213 |
| Kurtosis | -1.212221228 |
| Mean | 13.89731458 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 0.2736617617 |
| Sum | 4828886 |
| Variance | 64.52260981 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 6 | 32485 | 9.3% | |
| 26 | 30002 | 8.6% | |
| 10 | 29399 | 8.5% | |
| 17 | 29265 | 8.4% | |
| 7 | 25565 | 7.4% | |
| 8 | 25465 | 7.3% | |
| 20 | 22761 | 6.6% | |
| 21 | 19944 | 5.7% | |
| 4 | 19462 | 5.6% | |
| 27 | 16786 | 4.8% | |
| Other values (21) | 96335 | 27.7% |
| Value | Count | Frequency (%) | |
| 0 | 5358 | 1.5% | |
| 1 | 3588 | 1.0% | |
| 2 | 1221 | 0.4% | |
| 3 | 9995 | 2.9% | |
| 4 | 19462 | 5.6% |
| Value | Count | Frequency (%) | |
| 30 | 3595 | 1.0% | |
| 29 | 537 | 0.2% | |
| 28 | 349 | 0.1% | |
| 27 | 16786 | 4.8% | |
| 26 | 30002 | 8.6% |
geo_level_2_id
Real number (ℝ≥0)
| Distinct | 1418 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 701.8380517 |
|---|---|
| Minimum | 0 |
| Maximum | 1427 |
| Zeros | 53 |
| Zeros (%) | < 0.1% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 69 |
| Q1 | 350 |
| median | 706 |
| Q3 | 1050 |
| 95-th percentile | 1377 |
| Maximum | 1427 |
| Range | 1427 |
| Interquartile range (IQR) | 700 |
Descriptive statistics
| Standard deviation | 412.8756742 |
|---|---|
| Coefficient of variation (CV) | 0.5882776991 |
| Kurtosis | -1.190841647 |
| Mean | 701.8380517 |
| Median Absolute Deviation (MAD) | 349 |
| Skewness | 0.02506845849 |
| Sum | 243866966 |
| Variance | 170466.3224 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 39 | 5367 | 1.5% | |
| 158 | 3317 | 1.0% | |
| 181 | 2805 | 0.8% | |
| 1387 | 2675 | 0.8% | |
| 157 | 2548 | 0.7% | |
| 363 | 2343 | 0.7% | |
| 463 | 2310 | 0.7% | |
| 673 | 2249 | 0.6% | |
| 533 | 2217 | 0.6% | |
| 883 | 2129 | 0.6% | |
| Other values (1408) | 319509 | 92.0% |
| Value | Count | Frequency (%) | |
| 0 | 53 | < 0.1% | |
| 1 | 252 | 0.1% | |
| 3 | 90 | < 0.1% | |
| 4 | 415 | 0.1% | |
| 5 | 31 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1427 | 8 | < 0.1% | |
| 1426 | 378 | 0.1% | |
| 1425 | 611 | 0.2% | |
| 1424 | 8 | < 0.1% | |
| 1423 | 3 | < 0.1% |
geo_level_3_id
Real number (ℝ≥0)
| Distinct | 11861 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6258.84676 |
|---|---|
| Minimum | 0 |
| Maximum | 12567 |
| Zeros | 4 |
| Zeros (%) | < 0.1% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 614 |
| Q1 | 3073 |
| median | 6271 |
| Q3 | 9414 |
| 95-th percentile | 11928 |
| Maximum | 12567 |
| Range | 12567 |
| Interquartile range (IQR) | 6341 |
Descriptive statistics
| Standard deviation | 3646.950564 |
|---|---|
| Coefficient of variation (CV) | 0.582687307 |
| Kurtosis | -1.213916607 |
| Mean | 6258.84676 |
| Median Absolute Deviation (MAD) | 3173 |
| Skewness | 0.0005991846632 |
| Sum | 2174755225 |
| Variance | 13300248.41 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 9133 | 856 | 0.2% | |
| 633 | 841 | 0.2% | |
| 621 | 701 | 0.2% | |
| 11246 | 633 | 0.2% | |
| 11440 | 614 | 0.2% | |
| 2005 | 613 | 0.2% | |
| 7723 | 594 | 0.2% | |
| 9229 | 516 | 0.1% | |
| 2452 | 447 | 0.1% | |
| 10445 | 402 | 0.1% | |
| Other values (11851) | 341252 | 98.2% |
| Value | Count | Frequency (%) | |
| 0 | 4 | < 0.1% | |
| 1 | 6 | < 0.1% | |
| 2 | 2 | < 0.1% | |
| 3 | 13 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 12567 | 2 | < 0.1% | |
| 12566 | 2 | < 0.1% | |
| 12565 | 8 | < 0.1% | |
| 12564 | 7 | < 0.1% | |
| 12563 | 32 | < 0.1% |
count_floors_pre_eq
Real number (ℝ≥0)
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.130578555 |
|---|---|
| Minimum | 1 |
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.72776061 |
|---|---|
| Coefficient of variation (CV) | 0.3415788675 |
| Kurtosis | 2.36001261 |
| Mean | 2.130578555 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.8418180575 |
| Sum | 740310 |
| Variance | 0.5296355054 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 2 | 209029 | 60.2% | |
| 3 | 74171 | 21.3% | |
| 1 | 53705 | 15.5% | |
| 4 | 7186 | 2.1% | |
| 5 | 3039 | 0.9% | |
| 6 | 283 | 0.1% | |
| 7 | 52 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 53705 | 15.5% | |
| 2 | 209029 | 60.2% | |
| 3 | 74171 | 21.3% | |
| 4 | 7186 | 2.1% | |
| 5 | 3039 | 0.9% |
| Value | Count | Frequency (%) | |
| 9 | 1 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 7 | 52 | < 0.1% | |
| 6 | 283 | 0.1% | |
| 5 | 3039 | 0.9% |
age
Real number (ℝ≥0)
| Distinct | 42 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.53881353 |
|---|---|
| Minimum | 0 |
| Maximum | 995 |
| Zeros | 34725 |
| Zeros (%) | 10.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 10 |
| median | 15 |
| Q3 | 30 |
| 95-th percentile | 60 |
| Maximum | 995 |
| Range | 995 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 73.52774868 |
|---|---|
| Coefficient of variation (CV) | 2.770574072 |
| Kurtosis | 157.3751623 |
| Mean | 26.53881353 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 12.19598992 |
| Sum | 9221415 |
| Variance | 5406.329825 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10 | 51680 | 14.9% | |
| 15 | 48074 | 13.8% | |
| 5 | 45045 | 13.0% | |
| 20 | 42792 | 12.3% | |
| 0 | 34725 | 10.0% | |
| 25 | 32586 | 9.4% | |
| 30 | 23977 | 6.9% | |
| 35 | 14420 | 4.2% | |
| 40 | 14050 | 4.0% | |
| 50 | 9619 | 2.8% | |
| Other values (32) | 30501 | 8.8% |
| Value | Count | Frequency (%) | |
| 0 | 34725 | 10.0% | |
| 5 | 45045 | 13.0% | |
| 10 | 51680 | 14.9% | |
| 15 | 48074 | 13.8% | |
| 20 | 42792 | 12.3% |
| Value | Count | Frequency (%) | |
| 995 | 1851 | 0.5% | |
| 200 | 140 | < 0.1% | |
| 195 | 2 | < 0.1% | |
| 190 | 5 | < 0.1% | |
| 185 | 1 | < 0.1% |
area_percentage
Real number (ℝ≥0)
| Distinct | 86 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.017014467 |
|---|---|
| Minimum | 1 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 5 |
| median | 7 |
| Q3 | 9 |
| 95-th percentile | 16 |
| Maximum | 100 |
| Range | 99 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 4.388646483 |
|---|---|
| Coefficient of variation (CV) | 0.5474165602 |
| Kurtosis | 30.64344074 |
| Mean | 8.017014467 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 3.53162645 |
| Sum | 2785664 |
| Variance | 19.26021795 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 6 | 55959 | 16.1% | |
| 7 | 49140 | 14.1% | |
| 5 | 43556 | 12.5% | |
| 8 | 37988 | 10.9% | |
| 9 | 29572 | 8.5% | |
| 4 | 25675 | 7.4% | |
| 10 | 21030 | 6.1% | |
| 11 | 18390 | 5.3% | |
| 3 | 15687 | 4.5% | |
| 12 | 10148 | 2.9% | |
| Other values (76) | 40324 | 11.6% |
| Value | Count | Frequency (%) | |
| 1 | 125 | < 0.1% | |
| 2 | 4275 | 1.2% | |
| 3 | 15687 | 4.5% | |
| 4 | 25675 | 7.4% | |
| 5 | 43556 | 12.5% |
| Value | Count | Frequency (%) | |
| 100 | 1 | < 0.1% | |
| 96 | 3 | < 0.1% | |
| 92 | 3 | < 0.1% | |
| 90 | 1 | < 0.1% | |
| 86 | 7 | < 0.1% |
height_percentage
Real number (ℝ≥0)
| Distinct | 29 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.4347985 |
|---|---|
| Minimum | 2 |
| Maximum | 32 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 4 |
| median | 5 |
| Q3 | 6 |
| 95-th percentile | 9 |
| Maximum | 32 |
| Range | 30 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.915555029 |
|---|---|
| Coefficient of variation (CV) | 0.3524610948 |
| Kurtosis | 13.53489828 |
| Mean | 5.4347985 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.762884329 |
| Sum | 1888424 |
| Variance | 3.669351069 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 104869 | 30.2% | |
| 6 | 61837 | 17.8% | |
| 4 | 50427 | 14.5% | |
| 7 | 47360 | 13.6% | |
| 3 | 34535 | 9.9% | |
| 8 | 18460 | 5.3% | |
| 2 | 12348 | 3.6% | |
| 9 | 7146 | 2.1% | |
| 10 | 5934 | 1.7% | |
| 12 | 1246 | 0.4% | |
| Other values (19) | 3307 | 1.0% |
| Value | Count | Frequency (%) | |
| 2 | 12348 | 3.6% | |
| 3 | 34535 | 9.9% | |
| 4 | 50427 | 14.5% | |
| 5 | 104869 | 30.2% | |
| 6 | 61837 | 17.8% |
| Value | Count | Frequency (%) | |
| 32 | 90 | < 0.1% | |
| 31 | 2 | < 0.1% | |
| 29 | 1 | < 0.1% | |
| 28 | 2 | < 0.1% | |
| 26 | 3 | < 0.1% |
land_surface_condition
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| t | 288937 | 83.2% | |
| n | 47413 | 13.6% | |
| o | 11119 | 3.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
foundation_type
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| r | 292374 | 84.1% | |
| w | 20048 | 5.8% | |
| u | 18908 | 5.4% | |
| i | 14182 | 4.1% | |
| h | 1957 | 0.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
roof_type
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| n | 243975 | 70.2% | |
| q | 81905 | 23.6% | |
| x | 21589 | 6.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
ground_floor_type
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| f | 279591 | 80.5% | |
| x | 33109 | 9.5% | |
| v | 32731 | 9.4% | |
| z | 1334 | 0.4% | |
| m | 704 | 0.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
other_floor_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| q | 220286 | 63.4% | |
| x | 58139 | 16.7% | |
| j | 52912 | 15.2% | |
| s | 16132 | 4.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
position
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| s | 269463 | 77.6% | |
| t | 57258 | 16.5% | |
| j | 17647 | 5.1% | |
| o | 3101 | 0.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
plan_configuration
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| d | 333327 | 95.9% | |
| q | 7641 | 2.2% | |
| u | 4909 | 1.4% | |
| c | 450 | 0.1% | |
| s | 449 | 0.1% | |
| a | 353 | 0.1% | |
| o | 195 | 0.1% | |
| m | 64 | < 0.1% | |
| n | 54 | < 0.1% | |
| f | 27 | < 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
has_superstructure_adobe_mud
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 316554 | 91.1% | |
| 1 | 30915 | 8.9% |
has_superstructure_mud_mortar_stone
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 1 | 264798 | 76.2% | |
| 0 | 82671 | 23.8% |
has_superstructure_stone_flag
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 335528 | 96.6% | |
| 1 | 11941 | 3.4% |
has_superstructure_cement_mortar_stone
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 341104 | 98.2% | |
| 1 | 6365 | 1.8% |
has_superstructure_mud_mortar_brick
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 323848 | 93.2% | |
| 1 | 23621 | 6.8% |
has_superstructure_cement_mortar_brick
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 321440 | 92.5% | |
| 1 | 26029 | 7.5% |
has_superstructure_timber
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 258995 | 74.5% | |
| 1 | 88474 | 25.5% |
has_superstructure_bamboo
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 318046 | 91.5% | |
| 1 | 29423 | 8.5% |
has_superstructure_rc_non_engineered
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 332678 | 95.7% | |
| 1 | 14791 | 4.3% |
has_superstructure_rc_engineered
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 341964 | 98.4% | |
| 1 | 5505 | 1.6% |
has_superstructure_other
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 342243 | 98.5% | |
| 1 | 5226 | 1.5% |
legal_ownership_status
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| v | 334633 | 96.3% | |
| a | 7307 | 2.1% | |
| w | 3539 | 1.0% | |
| r | 1990 | 0.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
count_families
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9837395566 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 27937 |
| Zeros (%) | 8.0% |
| Memory size | 2.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4193854935 |
|---|---|
| Coefficient of variation (CV) | 0.4263176069 |
| Kurtosis | 17.24872251 |
| Mean | 0.9837395566 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.627559333 |
| Sum | 341819 |
| Variance | 0.1758841922 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 301377 | 86.7% | |
| 0 | 27937 | 8.0% | |
| 2 | 15010 | 4.3% | |
| 3 | 2415 | 0.7% | |
| 4 | 547 | 0.2% | |
| 5 | 135 | < 0.1% | |
| 6 | 33 | < 0.1% | |
| 7 | 8 | < 0.1% | |
| 9 | 4 | < 0.1% | |
| 8 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 27937 | 8.0% | |
| 1 | 301377 | 86.7% | |
| 2 | 15010 | 4.3% | |
| 3 | 2415 | 0.7% | |
| 4 | 547 | 0.2% |
| Value | Count | Frequency (%) | |
| 9 | 4 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 7 | 8 | < 0.1% | |
| 6 | 33 | < 0.1% | |
| 5 | 135 | < 0.1% |
has_secondary_use
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 308630 | 88.8% | |
| 1 | 38839 | 11.2% |
has_secondary_use_agriculture
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 325124 | 93.6% | |
| 1 | 22345 | 6.4% |
has_secondary_use_hotel
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 335764 | 96.6% | |
| 1 | 11705 | 3.4% |
has_secondary_use_rental
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 344642 | 99.2% | |
| 1 | 2827 | 0.8% |
has_secondary_use_institution
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347136 | 99.9% | |
| 1 | 333 | 0.1% |
has_secondary_use_school
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347343 | > 99.9% | |
| 1 | 126 | < 0.1% |
has_secondary_use_industry
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347103 | 99.9% | |
| 1 | 366 | 0.1% |
has_secondary_use_health_post
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347411 | > 99.9% | |
| 1 | 58 | < 0.1% |
has_secondary_use_gov_office
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347421 | > 99.9% | |
| 1 | 48 | < 0.1% |
has_secondary_use_use_police
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 347442 | > 99.9% | |
| 1 | 27 | < 0.1% |
has_secondary_use_other
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 345709 | 99.5% | |
| 1 | 1760 | 0.5% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r, only its direction does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
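The definition above can be sketched in a few lines of plain Python. This is an illustrative implementation of the textbook formula, not the routine the report itself calls; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the variances cancel out,
    # so plain sums of (cross-)deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
ys = [2 * x + 1 for x in xs]            # perfectly linear, positive slope
r_pos = pearson_r(xs, ys)
r_neg = pearson_r(xs, [-y for y in ys])  # same line, direction flipped
# Shifting and rescaling x leaves r unchanged (location/scale invariance).
r_shifted = pearson_r([10 * x - 3 for x in xs], ys)

print(r_pos, r_neg, r_shifted)
```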
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
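In other words, ρ is Pearson's r applied to ranks. The sketch below illustrates this with plain Python, assuming tied values share the average of their ranks (a common convention); the helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]          # monotonic but clearly nonlinear
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
tied = average_ranks([10, 20, 20, 30])

print(rho, r, tied)
```

On this monotonic but nonlinear example ρ reaches 1 while r stays below 1, which is the behaviour the paragraph above describes.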
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
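The pair-counting formula above corresponds to the variant usually called tau-a. The following plain-Python sketch implements exactly that formula; it is an assumption-laden illustration (library implementations commonly use the tie-adjusted tau-b variant, which differs when ties are present), and the function name `kendall_tau_a` is hypothetical.

```python
import math
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; tau-a counts it as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]   # one swapped pair: 5 concordant, 1 discordant
tau = kendall_tau_a(xs, ys)
tau_reversed = kendall_tau_a(xs, [4, 3, 2, 1])

print(tau, tau_reversed)
```

The quadratic pair loop is fine for illustration but would be slow on a dataset of this size.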
Phik (φk)
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for the phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 802906 | 6 | 487 | 12198 | 2 | 30 | 6 | 5 | t | r | n | f | q | t | d | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 28830 | 8 | 900 | 2812 | 2 | 10 | 8 | 7 | o | r | n | x | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 94947 | 21 | 363 | 8973 | 2 | 10 | 5 | 5 | t | r | n | f | x | t | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 590882 | 22 | 418 | 10694 | 2 | 10 | 6 | 5 | t | r | n | f | x | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 201944 | 11 | 131 | 1488 | 3 | 30 | 8 | 9 | t | r | n | f | x | s | d | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 333020 | 8 | 558 | 6089 | 2 | 10 | 9 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 728451 | 9 | 475 | 12066 | 2 | 25 | 3 | 4 | n | r | n | x | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 475515 | 20 | 323 | 12236 | 2 | 0 | 8 | 6 | t | w | q | v | x | s | u | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 441126 | 0 | 757 | 7219 | 2 | 15 | 8 | 6 | t | r | q | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 989500 | 26 | 886 | 994 | 1 | 0 | 13 | 4 | t | i | n | v | j | s | d | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Last rows
| | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 347459 | 290842 | 17 | 1149 | 5807 | 3 | 5 | 5 | 5 | t | r | n | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347460 | 330371 | 4 | 55 | 1996 | 2 | 25 | 9 | 5 | t | r | q | f | q | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347461 | 698612 | 20 | 173 | 5183 | 2 | 10 | 9 | 5 | t | w | q | f | x | s | u | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347462 | 445192 | 6 | 460 | 9258 | 2 | 5 | 14 | 6 | t | r | x | x | s | s | d | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347463 | 640115 | 7 | 1166 | 406 | 2 | 5 | 16 | 5 | t | i | x | v | s | s | d | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | v | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347464 | 310028 | 4 | 605 | 3623 | 3 | 70 | 20 | 6 | t | r | q | f | q | t | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | w | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347465 | 663567 | 10 | 1407 | 11907 | 3 | 25 | 6 | 7 | n | r | n | f | q | s | d | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347466 | 1049160 | 22 | 1136 | 7712 | 1 | 50 | 3 | 3 | t | r | n | f | j | s | d | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347467 | 442785 | 6 | 1041 | 912 | 2 | 5 | 9 | 5 | t | r | n | f | q | s | d | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | a | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 347468 | 501372 | 26 | 36 | 6436 | 2 | 10 | 11 | 4 | t | r | q | v | q | s | d | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | v | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |